Predicting Execution Bottlenecks in Map-Reduce Clusters

Authors

  • Edward Bortnikov
  • Ari Frank
  • Eshcar Hillel
  • Sriram Rao

Abstract

Extremely slow tasks, or stragglers, are a major performance bottleneck in map-reduce systems. The Hadoop infrastructure tries both to avoid them (by minimizing remote data accesses) and to handle them at runtime (through speculative execution). However, the mechanisms in place neither guarantee the avoidance of performance hotspots in task scheduling nor provide an easy way to tune the timely detection of stragglers. We suggest a machine-learning approach to address these problems and introduce a slowdown predictor: an oracle that forecasts how much slower a task will run on a given node compared to similar tasks. Slowdown predictors can be embedded in the map-reduce infrastructure to improve the agility and timeliness of scheduling decisions. We provide an initial evaluation to demonstrate the viability of our approach and discuss use cases for the new paradigm.
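
To make the quantity in the abstract concrete, the following minimal sketch computes a slowdown value for one task relative to similar tasks. The abstract does not give a formula, so the median baseline, the function names `slowdown` and `is_straggler`, and the threshold of 2.0 are all assumptions for illustration, not the authors' method. A predictor as described in the abstract would forecast this value before or during execution rather than measure it afterwards.

```python
from statistics import median


def slowdown(task_seconds, peer_seconds):
    """Ratio of one task's runtime to the median runtime of similar tasks.

    Assumption: the median of peer runtimes serves as the baseline; the
    abstract only says "compared to similar tasks".
    """
    if not peer_seconds:
        raise ValueError("need at least one similar task for a baseline")
    baseline = median(peer_seconds)
    if baseline <= 0:
        raise ValueError("peer runtimes must be positive")
    return task_seconds / baseline


def is_straggler(task_seconds, peer_seconds, threshold=2.0):
    """Flag a task whose slowdown reaches a hypothetical threshold."""
    return slowdown(task_seconds, peer_seconds) >= threshold


if __name__ == "__main__":
    peers = [10.0, 12.0, 8.0]          # runtimes of similar tasks, in seconds
    print(slowdown(30.0, peers))       # task took 30 s against a 10 s median
    print(is_straggler(30.0, peers))
```

A scheduler could compare a forecast of this ratio against the threshold when deciding where to place a task or when to launch a speculative copy; how the ratio is actually predicted is left open here.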

Similar resources

Assessment of Clustering Methods for Predicting Permeability in a Heterogeneous Carbonate Reservoir

Permeability, the ability of rocks to flow hydrocarbons, is directly determined from core. Due to high cost associated with coring, many techniques have been suggested to predict permeability from the easy-to-obtain and frequent properties of reservoirs such as log derived porosity. This study was carried out to put clustering methods (dynamic clustering (DC), ascending hierarchical clustering ...

Dimensioning Scientific Computing Systems to Improve Performance of Map-Reduce based Applications

Map-Reduce is a programming model widely used for processing large data sets on scientific clusters. Most of the efforts and research are focused on enhancing and alleviating the drawbacks of the model proposed by Google. The requirements of Map-Reduce based applications are often unclear because of the difficulty in satisfying the overall system throughput, as well as exploring alternatives to...

Overcoming performance bottlenecks in using OpenMP on SMP clusters

This paper presents a new parallel programming environment called ParADE to enable easy, portable, and high-performance computing for SMP clusters. Different from the prior studies, ParADE separates the programming model from the execution model: it enables shared-address-space programming while it realizes hybrid execution of message-passing and shared-address-space. To overcome the poor perfo...

An intermediate data placement algorithm for load balancing in Spark computing environment

Since MapReduce became an effective and popular programming framework for parallel data processing, key skew in intermediate data has become one of the important system performance bottlenecks. For solving the load imbalance of bucket containers in the shuffle process of the Spark computing framework, this paper proposes a splitting and combination algorithm for skew intermediate data blocks (S...

Large Scale Metagenomic Sequence Clustering via Sketching and Maximal Quasi-clique Enumeration on Map-Reduce Clusters

Taxonomic clustering of species from millions of DNA fragments sequenced from their genomes is an important and frequently arising problem in metagenomics. High-throughput next generation sequencing is enabling the creation of large metagenomic samples, while at the same time making the clustering problem harder due to the short sequence length supported and sampling of hitherto unknown species...

Publication date: 2012